Special Characters


The following is a complete list of all the special characters that can be used in a regular expression. Note that the flags 'g', 'i' and 'gi' can be placed after the final slash to specify a global search, a case-insensitive search, or a search that is both global and case-insensitive, respectively. See the RegExp object.
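
As a minimal sketch of how the flags change the result, the following uses the match method of a string. The sample text and the variable names are illustrative assumptions only.

```javascript
const flagText = "Cat, cat, CAT";

// No flag: only the first case-sensitive match is returned.
const firstExact = flagText.match(/cat/);

// 'g': every case-sensitive match is returned.
const allExact = flagText.match(/cat/g);

// 'i': the first match, ignoring case.
const firstAnyCase = flagText.match(/cat/i);

// 'gi': every match, ignoring case.
const allAnyCase = flagText.match(/cat/gi);

console.log(firstExact[0], allExact, firstAnyCase[0], allAnyCase);
```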
 
\
 
The backslash is used in two ways. Firstly, it is placed before certain letters of the alphabet to indicate that the letter is not to be taken literally but stands for a special character. For example, a regular expression consisting of the letter 't' would be created using /t/, whereas for a tab character you would use /\t/. Secondly, it is placed before a special character that you want to use literally: e.g. the character $, which normally matches at the end of a line or of input, becomes \$ when used literally.
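
The two uses of the backslash can be sketched as follows. The sample string and variable names are assumptions made for illustration.

```javascript
// "price:" followed by a tab, a dollar sign and the digit 5.
const priceText = "price:\t$5";

// /\t/ looks for a tab character; /t/ looks for the letter 't'.
const tabIndex = priceText.search(/\t/);
const hasLetterT = /t/.test(priceText);

// /\$/ looks for a literal dollar sign; an unescaped /$/ matches the end of input.
const dollarIndex = priceText.search(/\$/);
const endIndex = priceText.search(/$/);

console.log(tabIndex, hasLetterT, dollarIndex, endIndex);
```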
 
^
 
The caret is used for a match at the beginning of a line or of input. For example, with the string "Association of Carpenters", the regular expression /^A/ would match the initial capital letter 'A' of 'Association', but it would not match the initial 'A' of 'Association' in the string "Teachers Association".
 
$
 
The dollar sign is used for a match at the end of a line or of input. So, with the string "his cats" the regular expression /s$/ would match the final letter 's' in 'cats' but not in 'his'.
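
The caret and dollar sign examples above can be checked with a short sketch such as this one (variable names are illustrative):

```javascript
// ^ anchors the match to the beginning of the input.
const startsWithA = /^A/.test("Association of Carpenters");
const laterA = /^A/.test("Teachers Association");

// $ anchors the match to the end of the input.
const finalS = "his cats".search(/s$/);

console.log(startsWithA, laterA, finalS);
```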
 
*
 
The asterisk is used to match 0 or more occurrences of the preceding character. So, for example, the regular expression /ators*/ would match the 'ator' of "alligator" and the 'ators' of "navigators". Note that, because zero occurrences are allowed, the regular expression /a*/g, where the asterisk follows a single letter, would match every run of one or more 'a's throughout the string and would also produce an empty match at each position where no 'a' is present.
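
A brief sketch of the asterisk in use. Because the asterisk permits zero occurrences, a pattern such as /a*/g also returns empty strings where no 'a' is found; the sample word "baaad" is an assumption chosen to show this.

```javascript
// The trailing 's' is optional and may repeat.
const atorMatch = "alligator".match(/ators*/)[0];
const atorsMatch = "navigators".match(/ators*/)[0];

// With the 'g' flag, /a*/ returns the run of 'a's and also empty
// matches at the positions where there is no 'a'.
const aRuns = "baaad".match(/a*/g);

console.log(atorMatch, atorsMatch, aRuns);
```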
 
+
 
The plus sign matches one or more occurrences of the preceding character. As such, it is equivalent to {1,}. For example, the regular expression /e+/ would match the 'e' in 'sped' and the 'ee' in 'speed'.
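
As a small sketch, the plus sign takes as many consecutive occurrences as it can find, and behaves the same as {1,} (variable names are illustrative):

```javascript
const singleE = "sped".match(/e+/)[0];
const doubleE = "speed".match(/e+/)[0];

// {1,} is the long form of +.
const doubleELongForm = "speed".match(/e{1,}/)[0];

// Unlike *, at least one occurrence is required.
const noE = /e+/.test("spd");

console.log(singleE, doubleE, doubleELongForm, noE);
```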
 
?
 
The question mark is used to match the preceding character 0 or 1 times. For example, the regular expression /e?re?/ would match the 'er' of "theater" and also the 're' of the British spelling "theatre".
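
The optional characters in /e?re?/ can be seen in this short sketch (variable names are illustrative):

```javascript
const spellingPattern = /e?re?/;

// In "theater" the leading 'e' is used; in "theatre" the trailing 'e' is used.
const american = "theater".match(spellingPattern)[0];
const british = "theatre".match(spellingPattern)[0];

console.log(american, british);
```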
 
.
 
The period (decimal point) matches any single character except the newline character. So, for instance, with the string "The cat eats moths" the regular expression /.t/gi would match the letters 'at' in 'cat', 'at' in 'eats' and 'ot' in 'moths', but not the initial 'T' of 'The', because no character precedes it.
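
The example above can be reproduced with the following sketch (variable names are illustrative):

```javascript
// Any character followed by 't' or 'T'; the initial 'T' has nothing before it.
const dotMatches = "The cat eats moths".match(/.t/gi);

// The period does not match a newline character.
const dotMatchesNewline = /./.test("\n");

console.log(dotMatches, dotMatchesNewline);
```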
 
(x)
 
Putting a regular expression inside parentheses causes it to be matched and remembered. Each parenthesized expression can then be referenced using the index of the resulting array, or by using the $1, ..., $9 properties of the RegExp object.
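
A minimal sketch of remembered matches, read here through the indexes of the array returned by match. The date-like sample string is an assumption, and the use of $1 and $2 inside a replacement string is shown as a related way of referring to the same remembered substrings.

```javascript
const dateText = "2001-05";
const dateParts = dateText.match(/(\d+)-(\d+)/);

// Index 0 holds the whole match; indexes 1 and 2 hold the remembered substrings.
const whole = dateParts[0];
const year = dateParts[1];
const month = dateParts[2];

// In a replacement string, $1 and $2 stand for the remembered substrings.
const swapped = dateText.replace(/(\d+)-(\d+)/, "$2/$1");

console.log(whole, year, month, swapped);
```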
 
x|y
 
This expression matches either x or y, so, for example, the regular expression /hot|cold/ will match the 'hot' of 'hot potato' and the 'cold' of 'cold potato'.
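
For instance, as a short sketch (the third test string is an added assumption):

```javascript
const temperature = /hot|cold/;

const hotMatch = "hot potato".match(temperature)[0];
const coldMatch = "cold potato".match(temperature)[0];

// Neither alternative is present here.
const warmMatch = temperature.test("warm potato");

console.log(hotMatch, coldMatch, warmMatch);
```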
 
{n}
 
This expression matches exactly 'n' occurrences of the preceding character, where 'n' is a positive integer. So with the string "the missing snake hisssed" the regular expression /s{2}/g would match both the 'ss' in 'missing' and the first two of the three 's's in 'hisssed'.
 
{n,}
 
This expression matches at least 'n' occurrences of the preceding character, where 'n' is a positive integer. With the string "the missing snake hisssed", the regular expression /s{2,}/g would match the 'ss' of 'missing' and the 'sss' of 'hisssed'.
 
{n,m}
 
This expression matches at least 'n' and at most 'm' occurrences of the preceding character, where 'n' and 'm' are positive integers. So, with the string "the missing snake hisssssed", the regular expression /s{2,4}/g will match the 'ss' of 'missing' and the first four 's's of 'hisssssed'.
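
The three brace forms, {n}, {n,} and {n,m}, can be compared side by side with a sketch like this one (variable names are illustrative):

```javascript
// {2}: exactly two; the third 's' of 'hisssed' is left over.
const exactlyTwo = "the missing snake hisssed".match(/s{2}/g);

// {2,}: two or more; all three 's's of 'hisssed' are taken.
const twoOrMore = "the missing snake hisssed".match(/s{2,}/g);

// {2,4}: between two and four; the fifth 's' of 'hisssssed' is left over.
const twoToFour = "the missing snake hisssssed".match(/s{2,4}/g);

console.log(exactlyTwo, twoOrMore, twoToFour);
```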
 
[xyz]
 
This expression matches any one of the set of characters enclosed within the square brackets. A range of characters can be abbreviated with a hyphen: e.g. you can use [a-e] instead of [abcde]. With the string "the black cat", the regular expression /[abc]/g would match the 'b', 'a' and 'c' in 'black' and the 'c' and 'a' in 'cat'.
 
[^xyz]
 
This expression matches any single character other than those following the caret within the square brackets. A range of characters can be abbreviated with a hyphen: e.g. you can use [^a-e] instead of [^abcde]. With the string "black", the regular expression /[^bla]/ would match the letter 'c'.
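
Both kinds of character set, [xyz] and [^xyz], are sketched below (variable names are illustrative):

```javascript
// Any one of 'a', 'b' or 'c'.
const inSet = "the black cat".match(/[abc]/g);

// A hyphenated range gives the same result as the spelled-out set.
const spelledOut = "black".match(/[abcde]/g);
const asRange = "black".match(/[a-e]/g);

// A leading caret negates the set: the first character that is not 'b', 'l' or 'a'.
const notInSet = "black".match(/[^bla]/)[0];

console.log(inSet, spelledOut, asRange, notInSet);
```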
 
[\b]
 
This expression matches a backspace character.
 
\b
 
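This expression matches a word boundary, i.e. the position between a word character and a non-word character such as a space, or the start or end of input next to a word character. So, for instance, with the string "the black cat", the regular expression /\bc/ would match the 'c' in 'cat', not the one in 'black'.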
 
\B
 
This expression is used to match a non-word boundary, i.e. any position that is not a word boundary. For example, with the string "the black cat", the regular expression /\Bc/g would match the 'c' in 'black' but not the one in 'cat'.
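
The difference between \b, \B and [\b] can be sketched as follows (variable names are illustrative):

```javascript
const phrase = "the black cat";

// \bc: a 'c' at the start of a word (the 'c' of 'cat').
const wordStartC = phrase.search(/\bc/);

// \Bc: a 'c' inside a word (the 'c' of 'black').
const insideC = phrase.search(/\Bc/);

// Inside square brackets, \b means a backspace character instead.
const backspaceFound = /[\b]/.test("\b");
const backspaceInSpace = /[\b]/.test(" ");

console.log(wordStartC, insideC, backspaceFound, backspaceInSpace);
```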
 
\cX
 
This expression matches a control character: i.e. a combination of the control key <CTRL> and any other key represented by the 'X'.
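
As a brief sketch, the control characters produced by CTRL-I, CTRL-J and CTRL-M are the tab, linefeed and carriage return respectively, so they can be matched like this (variable names are illustrative):

```javascript
const ctrlIIsTab = /\cI/.test("\t");
const ctrlJIsLinefeed = /\cJ/.test("\n");
const ctrlMIsReturn = /\cM/.test("\r");

console.log(ctrlIIsTab, ctrlJIsLinefeed, ctrlMIsReturn);
```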
 
\d
 
This expression matches any digit character, and is equivalent to [0-9]. For example, with the string "the 4th of July", the regular expression /\d/ would match the character '4'. (The expression /[0-9]/ would work just the same.)
 
\D
 
This expression matches any non-digit character, and is equivalent to [^0-9]. So, with the string "45X", the regular expression /\D/ would match the letter 'X'. (The expression /[^0-9]/ would work just as well.)
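
The digit and non-digit expressions, together with their bracketed equivalents, can be sketched as:

```javascript
const digit = "the 4th of July".match(/\d/)[0];
const digitLongForm = "the 4th of July".match(/[0-9]/)[0];

const nonDigit = "45X".match(/\D/)[0];
const nonDigitLongForm = "45X".match(/[^0-9]/)[0];

console.log(digit, digitLongForm, nonDigit, nonDigitLongForm);
```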
 
\f
 
This expression matches a formfeed.
 
\n
 
This expression matches a linefeed.
 
\r
 
This expression matches a carriage return.
 
\s
 
This expression matches any white space character including tab, line feed and form feed. For ASCII text it is equivalent to [ \t\v\f\n\r]. So, for example, with the string "Christmas Day", the expression /\s/ would match the space between the two words.
 
\S
 
This expression is the opposite of \s and matches any non-white space character. For ASCII text it is equivalent to [^ \t\v\f\n\r]. So, with the string "Christmas Day", the expression /\S/ would match the 'C' of 'Christmas'.
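
A short sketch of \s and \S. The split example and its sample string are added assumptions, included to show tab and linefeed being treated as white space.

```javascript
// The first white space character is the space after "Christmas".
const spaceIndex = "Christmas Day".search(/\s/);

// The first non-white space character is the 'C'.
const firstVisible = "Christmas Day".match(/\S/)[0];

// Tab and linefeed both count as white space.
const pieces = "a\tb\nc".split(/\s/);

console.log(spaceIndex, firstVisible, pieces);
```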
 
\t
 
This expression matches a tab character.
 
\v
 
This expression matches a vertical tab character.
 
\w
 
This expression matches any alpha-numeric character including the underscore, and is therefore equivalent to [a-zA-Z0-9_]. For example, with the string "O = 2a", the regular expression /\w/ would match the first character 'O', and /\w/g would match the characters 'O', '2' and 'a'.
 
\W
 
This expression is the opposite of \w and matches any character other than an alpha-numeric character or the underscore. It is therefore equivalent to [^a-zA-Z0-9_]. With the string "O = 2a", the expression /\W/ matches the space following the 'O', and /\W/g would also match the '=' and the second space.
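
As a sketch of how \w and \W divide up a string between them, using a sample string chosen for illustration:

```javascript
const expression = "x_1 + y";

// Letters, digits and the underscore are word characters.
const wordChars = expression.match(/\w/g);

// Everything else, here the spaces and the plus sign, is a non-word character.
const nonWordChars = expression.match(/\W/g);

console.log(wordChars, nonWordChars);
```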
 
\n
 
Where n is a positive integer, this expression refers to a previous parenthesized substring within a regular expression. (See the (x) expression.) With the string "John, Francis, Bob, Sherri", the regular expression /[E-H]\w*(,)(\s)\w*\1\2/ would match the substring "Francis, Bob, ". This saves you having to type a complicated substring repeated within a regular expression.
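
The example above can be reproduced with the sketch below. The repeated-word pattern at the end is an added illustration, not part of the original example.

```javascript
const names = "John, Francis, Bob, Sherri";

// \1 repeats what (,) remembered and \2 repeats what (\s) remembered.
const backRef = names.match(/[E-H]\w*(,)(\s)\w*\1\2/);

// A back reference can also detect a repeated word.
const repeatedWord = /(\w+) \1/.test("the the cat");
const noRepeatedWord = /(\w+) \1/.test("the cat");

console.log(backRef[0], repeatedWord, noRepeatedWord);
```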
 
Note that if n represents a number higher than the number of previous parenthesized substrings, then it is treated as an octal escape. See below.
 
\ooctal
 
Where o is an octal escape value, this expression allows you to embed ASCII codes into regular expressions.
 
\xhex
 
Where x is a hexadecimal escape value, this expression allows you to embed ASCII codes into regular expressions.
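
As a minimal sketch of the hexadecimal form only, the ASCII code of the capital letter 'A' is 41 in hexadecimal, so it can be embedded as \x41 (variable names are illustrative):

```javascript
// Confirm the hexadecimal ASCII code of 'A'.
const hexCodeOfA = "A".charCodeAt(0).toString(16);

// \x41 therefore matches 'A', and does not match the lower case 'a'.
const matchesUpperA = /\x41/.test("A");
const matchesLowerA = /\x41/.test("a");

console.log(hexCodeOfA, matchesUpperA, matchesLowerA);
```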


Copyright 1999-2001 by Infinite Software Solutions, Inc. All rights reserved.